-
Notifications
You must be signed in to change notification settings - Fork 207
Enable string conversion in EUC-JP. #1296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable string conversion in EUC-JP. #1296
Conversation
|
@swift-ci Please test |
Sources/FoundationEssentials/String/StringProtocol+Essentials.swift
Outdated
Show resolved
Hide resolved
Sources/FoundationInternationalization/ICU/ICU+StringConverter.swift
Outdated
Show resolved
Hide resolved
Sources/FoundationEssentials/String/StringProtocol+Essentials.swift
Outdated
Show resolved
Hide resolved
7ae3f7f to
b0a8981
Compare
|
@swift-ci Please test |
|
@parkera @jmschonfeld Thank you for reviewing. In response to your feedback, I've made some changes:
|
|
Please let me request reviews again as per changes above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good to me, but also want @jmschonfeld to chime in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for putting this together!
96de131 to
bc1656d
Compare
Background: EUC-JP is not supported by OSS CoreFoundation, while it is supported by macOS Foundation Framework. See swiftlang#1016 This commit resolves the issue by calling ICU API if necessary.
bc1656d to
844479e
Compare
|
@swift-ci Please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - that resolves the build failures, but I have a comment about one issue in the tests that I've discovered as well (this should hopefully be the last thing, thanks for bearing with me)
| XCTAssertNil(sushi.data(using: String._Encoding.japaneseEUC)) | ||
| XCTAssertEqual( | ||
| sushi.data(using: String._Encoding.japaneseEUC, allowLossyConversion: true), | ||
| "Sushi?".data(using: .utf8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test unfortunately fails in the FOUNDATION_FRAMEWORK configuration because Foundation.framework's implementation of this conversion replaces both UTF-16 scalars of the emoji with ? rather than replacing the entire grapheme cluster with a single ?. I suspect that maybe ICU is replacing only with a single ? - do you happen to know if that's intentional? If so, for now we might need to vary the expected value to be Sushi?? in FOUNDATION_FRAMEWORK vs. Sushi? in !FOUNDATION_FRAMEWORK to account for the discrepancy between Foundation.framework and ICU (assuming we think this result from ICU is indeed correct and not a bug in how we're calling ICU)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my overlooking again, but I'm not sure if we should implement conversion just like Foundation framework.
I mean the behavior of Foundation framework looks buggy because it doesn't seem to handle surrogate pairs correctly. (🍣 is non-BMP but consists of only one scalar U+1F363. Does that "buggy" behavior come from the historical reason that NSString has been represented as a sequence of UTF–16??)
Of course, we can imitate such behavior by replacing UCNV_FROM_U_CALLBACK_SUBSTITUTE with a custom callback function.
However, let me point out that current swift-foundation's implementation for .ascii is code-point-basis, not UTF-16-value-basis:
import Foundation
let sushiEmoji = "🍣"
print(sushiEmoji.data(using: .ascii, allowLossyConversion: true)!.count)
// -> Prints "1"Options...?
- To keep consistent with current (e.g.)
.asciiimplementation:- Divide the test case for
FOUNDATION_FRAMEWORKand!FOUNDATION_FRAMEWORK.
- Divide the test case for
- To make it compatible with Foundation framework's behavior:
- Implement a custom callback for
ucnv_setFromUCallBack.
- Implement a custom callback for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I suspect this behavior comes from the fact that NSString is natively stored as UTF-16. For now, let's just divide the test case with a FOUNDATION_FRAMEWORK/!FOUNDATION_FRAMEWORK value for this expectation since I think enabling the conversion at all on non-Darwin is a good step even if the behavior is slightly different than Foundation.framework, and I can come back with a separate change to look into making this behavior the same across both platforms (while ensuring we don't break compatibility with existing clients).
…ersion. In response to: swiftlang#1296 (comment)
|
@jmschonfeld (I didn't know CI didn't execute test cases with |
|
@swift-ci Please test |
|
Thanks for all of your updates - looks great to me now! |
* Enable string conversion in EUC-JP. Background: EUC-JP is not supported by OSS CoreFoundation, while it is supported by macOS Foundation Framework. See swiftlang#1016 This commit resolves the issue by calling ICU API if necessary. * ICU: Omit encodings that should be supported by FoundationEssentials. In response to: swiftlang#1296 (comment) * ICU: Remove unnecessary `nonisolated(unsafe)` from static property. In response to: swiftlang#1296 (comment) * Add comment to `func _icuMakeStringFromBytes_impl`. In response to: swiftlang#1296 (comment) * Delegate string conversion to ICU only when encoding is EUC-JP. In response to: swiftlang#1296 (comment) * Replace dynamic `_icu*` functions only if `!FOUNDATION_FRAMEWORK`. In response to: swiftlang#1296 (comment) * Divide test cases depending on `FOUNDATION_FRAMEWORK` for EUC_JP conversion. In response to: swiftlang#1296 (comment)
Background: EUC-JP is not supported by OSS CoreFoundation, while it is supported by macOS Foundation Framework.
See #1016
This PR resolves the issue by calling ICU API if necessary.